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Abstract. In this article, an online manifold learning approach, implicitly reduced Picard iter- 
ation (IRPI), is developed for dimensional reduction of dynamical systems. The approach may be 
viewed as a variant of Picard iteration combined with a model reduction procedure. Specifically, 
starting with a test solution, we solve a reduced model to obtain a better approximation, and repeat 
this process until sufficient accuracy is achieved. The reduced model is obtained by projecting the 
full model onto a subspace which is spanned by the dominant modes of an extended data ensemble. 
The extended data ensemble in this article contains not only the state of some snapshots of the ap- 
proximating solution from the previous iteration, but also the associated vector field. Therefore, the 
proposed manifold learning procedure takes advantage of the information of the original dynamical 
system to reduce also the dynamics. Moreover, it is computed in the online stage, as opposed to 
offline learning in many projection-based model reduction techniques which needs prior calculations 
or experiments. It has been proved that the proposed sequence of functions converges to the actual 
solution of the original system as long as the vector field of full model is locally Lipschitz on an 
open set that contains the solution trajectory. Good accuracy of the proposed technique was demon- 
strated in several numerical examples from linear convection-diffusion equation to nonlinear Burgers 
equation. In order to save more computational cost, the IRPI technique is extended to a local model 
reduction approach by partitioning the entire time domain into several subintervals and obtaining 
a series of local reduced models with much lower dimensions. The accuracy and efHciency of the 
locally reduced model arc shown through the numerical simulation of several problems including the 
Navier-Stokes equation in a lid-driven cavity flow problem. 
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1. Introduction. Reducing statistical data redundancy in high dimensional 
problems could enable a low dimensional reduction without a loss of dominant in- 
formation contents. The problem of manifold learning and dimensionality reduction 
appears in many fields in science and engineering including pattern recognition and 
machine learning [11] , signal analysis [1] , image processing [34] , electrical power grids 
[251 1^ . structural dynamics [5] , and chemical reaction systems [TT] [3S| to list a 
few. The classical linear techniques for manifold learning such as principal compo- 
nent analysis (PC A) [25] [21] and multidimensional scaling [23] are efficient to discover 
the true structure of data lying on or near a linear subspace of the input space. In 
order to discover the nonlinear structures, many prominent nonlinear algorithms are 
proposed, such as Isomap |40j . locally linear embedding j35j and Laplacian cigenmaps 

In many applications of dynamical systems, conventional methods of direct nu- 
merical simulation of all scales in a system are often computationally intensive and 
they require massive computing resources. For this reason, during the past several 
decades, many efforts have been put forward to develop reduced models on large-scale 
complex systems. The core of model reduction is to provide an efficient computational 
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prototyping tool to replace a high order system of differential equations with a sys- 
tem of substantially lower dimension, whereby only the most dominant properties of 
the full system are preserved. Several model reduction techniques have been studied 
in various fields. Krylov subspacc techniques [BJ [7] extend the success of extracting 
accurate reduced models from linear systems to nonlinear ones. Method of balanced 
truncation [42( 1381 122j has been widely studied by the control community. Proper 
orthogonal decomposition (POD, also known as Karhunen-Loeve decomposition, sin- 
gular systems analysis, singular value decomposition and principle component analy- 
sis) [M] is a well-know method which has a wide application in CFD-based modelling 
and control [20l [36l [5l [T0| . Although POD, as a linear model reduction technique, al- 
ways looks for linear subspace instead of its curved submanifold, it is computationally 
tractable and able to capture the dominant patterns in a nonlinear dynamical system. 
A typical POD procedure involves an offline-online strategy. In the offline manifold 
learning stage, the standard singular value decomposition (SVD) is carried out to find 
a set of empirical eigenfunctions (also known as reduced order bases, empirical ba- 
sis functions, and empirical orthogonal functions). The data vectors, collected from 
experiments or numerical simulations of the full systems, approximately rely on a 
linear (or an affine) subspace spanned by empirical eigenfunctions with least mean 
square error for a specified dimension. In the online stage a reduced model obtained 
by the Galerkin projection is computed to obtain the solution. Recently, many new 
model reduction techniques based on POD were developed to reduce the complexity 
of evaluating the nonlinear term through the standard POD-Galerkin approach, such 
as missing point estimation [i], trajectory piecewise- linear approximation [311 133]; 
empirical interpolation method [51 118j , and discrete empirical interpolation |12j . 

Although the offline manifold learning approach can save more computational 
time for the online computation, it suffers from two main disadvantages. First, the 
construction of empirical eigenfunctions during the offline stage is expensive, as it 
requires experimental results or solutions of a number of large-scale systems in the 
high dimensional space. Secondly, there is no guarantee that the solutions associated 
with a different set of initial conditions or parameters rely on the same subspace 
spanned by prior empirical data vectors. Therefore, the classical POD method usually 
lacks robustness with respect to parameter change and can yield unpredictable results 
[3Tj . For these reasons, an efficient reduced model with high fidelity based on online 
manifold learning is preferable^ However, much less effort has been spent in the 
field of model reduction via online manifold learning. In [26[ solution vectors 
from previous time steps are extracted to reveal the spatial patterns at current step. 
Updated empirical eigenfunctions are computed immediately after the reduced-order 
equations are advanced for one step. Yet this method might not be feasible for 
nonlinear systems if snapshots in future steps leave the subspace spanned by solution 
vectors of pervious steps. In [30| dynamic iteration using reduced model combines 
the idea of waveform relaxation technique and model reduction, which simulates each 
subsystem connected to model reduced versions of the other subsystems. Even without 
any prior empirical data, this method can effectively reduce the dimension of large 
scale molecular systems to obtain a convergent solution. 

Both online learning and offline learning techniques invariably involve a trade- 
off between complexity and accuracy /realiability of the model. Although the former 
has better descriptive or predictive features, one usually resorts to a more complex 



^To avoid confusion, "online" in this article means that the manifold learning is carried out in 
the online stage, as opposed to the offline stage. 
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reduced model whieh may be even as expensive as to solve the full system. The 
balance between complexity and accuracy of reduced models raises three problems: 
1) Without prior empirical data, how can one find a subspace where the actual solution 
of a nonlinear system resides? 2) If a trial solution is given initially, how can one tests 
its accuracy? 3) Is it possible to formulate a fast online learning algorithm for model 
reduction which can achieve large speedups while maintaining high accuracy? 

To answer the first problem, a new framework of iterative manifold learning, 
implicitly reduced Picard iteration (IRPI), is proposed here for reduced modelling of 
high-order nonlinear dynamical systems. Similar to the well-known Picard iteration, a 
trial solution is set at the very beginning. In each iteration, a set of updated empirical 
eigenfunctions are constructed by extracting dominant modes from an extended data 
ensemble, then a more accurate solution is obtained by solving the reduced equation 
on a new subspace spanned by these empirical eigenfunctions. The extended data 
ensemble not only contains the state vectors of some snapshots of the solution in the 
previous iteration, but also the associated vector filed. Therefore, the manifold learn- 
ing process essentially takes advantage of the information from the original dynamical 
system. One could argue that this results in actively reducing the dynamics as well 
as the solution snapshots. In principle, any manifold learning techniques could be 
used to find a set of empirical functions of a subspace, and any projection methods 
can produce a reduced model. In this article, we take the standard POD-Galerkin 
approach for example. Both analytical results and numerical simulations indicate that 
a sequence of functions asymptotically converges to the solution of the full system. 

The above mentioned approach can also solve the second problem with a minor 
modification. If a test solution is given, a linear subspace can be constructed by 
extracting information from snapshots of the higher dimensional space. A posterior 
error estimation is given by the difference between the trial solution and the better 
solution obtained by the reduced model in the subspace. 

For the third problem, we seek an efficient online manifold learning approach 
with significant speedups. In many nonlinear problems, the solutions have significant 
variations as time proceeds. Accordingly, the relevant reduced model need calcula- 
tion with an increasing number of modes in order to capture the main properties 
of the entire trajectory. To this effect, a local model reduction technique based on 
IRPI, locally linear projection (LLP), is proposed where the whole entire domain is 
partitioned into several subintervals. The full model is, then, projected onto a set 
of local subspaces with significantly lower dimensions. The locally reduced model is 
updated when the solution moves from one subinterval to another. The efficiency 
of this method is illustrated in a numerical example: the two-dimensional lid-driven 
cavity flow problem. 

The remainder of this article is organized as follows. Since the model reduc- 
tion techniques in this paper fall in the category of projection methods, the classical 
Galerkin-POD approach and its ability to minimize truncation error is briefly re- 
viewed in section [2] Then a review of Picard iteration is presented in section [3] and 
for introduction to a reduced Picard iteration technique. A memory-efficient and 
iterative manifold learning technique is devolved in section |4j which is a modified 
version of the reduced Picard iteration. After presenting the algorithm in section l4Tl 
we provide analytical results of convergence in section 14.21 and a complexity analysis 
in section 14.31 We also illustrate that the proposed method has a high accuracy for 
linear convection-diffusion equation and nonlinear Burgers equation in section In 
section [SJ the locally linear projection approach is proposed to save more computa- 
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tional cost. The performance of this technique is evaluated in a hd-driven cavity flow 
problem. Finally, conclusions are offered in section [B] 

2. Background on Model Reduction Techniques. In physical world, the 
description of many dynamical systems involves solving partial differential equations 
(PDEs) with first order time-derivative terms. These include linear heat equation, 
Schrodinger equation. Burgers equation, and Navier-Stokes equations to name a few. 
After spatial discretization (for example, using finite difference or finite element meth- 
ods), the full system projected on R" is given by a set of ordinary differential equations 
(ODE) and its initial condition, 

(2.1) x^f{t,x); x{0)^xo. 

Here a; : [0, T] M" and the discretized vector field / : [0, T] xR" ^ R" is an algebraic 
operator. To indicate the dependence of x on its initial value, let w{t]Xo) x(t) be 
a function of both t and xq ■ The evolution rule can be considered as a mapping from 
the phase space to itself which is parameterized by time. By definition. w{t;xQ) is a 
flow which gives an orbit in R" as t varies over [0, T] for a fixed xq. The orbit contains 
a sequence of states (solution vectors) that follow from xq. 

If / is a nonlinear operator and the dimension n is large, lower dimensional models 
that approximate the full system are often beneficial in terms of computational saving. 
As a classical projection-based model reduction method, the Galerkin-POD method 
is briefly reviewed here, for its capability of providing an accurate reduced model with 
significantly lower dimension, also because of its application in our online manifold 
learning approach for model reduction. 

2.1. Galerkin Projection. Galerkin projection can provide a lower dimen- 
sional approximation by projecting the full system onto a linear (or an affine) sub- 
space. For any n dimensional linear subspace S, there exists an n x k orthonormal 
matrix $ = (0i, (jfk), whose columns form a complete basis associated with it. The 
orthonormality of the column matrix requires that = Ik where Ik is the k x k 

identity matrix. Any state x E R" can be projected onto the subspace S" by a linear 
projection mapping, 

(2.2) z = 

where z e R'^ is the coordinate in the subspace coordinate system and the superscript 
T denotes matrix transpose. In the original coordinate system, this point can be 
expressed as 

(2.3) i = 

Substituting (|2.2p into (j2.3p . we immediately obtain the projection x € S oi sl point 
X in the original coordinate system, i.e., 

(2.4) i = Px, 

where P = <I><I>-^ G R"^". Suppose a vector field fk{t,x) in the subspace S is con- 
structed by fk{t,x) = ^^f{t, <I>z), then a reduced model of the original system (|2.ip 
can be obtained. 



(2.5) 



i = $^/(t,$z); zo = <P^xo. 
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Figure 2.1. ( Color online.) Illustration of the actual solution x{t) for the original system i2.H) , 
the projected solution x{t) on the subspace S jg.^l ) and the approximating solution x{t) computed by 
the reduced model h2. 61) . The component of error parallel to S is given by ei{t) = x{t) — x{t), and 
the component of error orthogonal to S is given by eo{t) := x(t) — x{t). Reproduced from \31tl . 

Hence, the reduced solution in the original coordinate system x{t) ~ $z(t) G R" is 
equivalent to the following initial value problem 

(2.6) i = Pf{t,x)] xo=Pxo. 

The error of the reduced model formed by Galerkin projection can be defined 
as e{t) := x{t) — x{t). Let eo{t) := (/ — P)e{t), which denotes the error component 
orthogonal to the projected subspace S, and ei{t) :— Pe{t), which denotes the compo- 
nent of error parallel to S. Thus eo{t) and ei{t) are orthogonal to each other. eo{t) is 
the error between x{t) and its projection x{t) onto 5*, which comes from POD or other 
dimensionality reduction techniques. But since the system is evolutionary with time, 
further approximations of projection based reduced model result in an additional error 
e,{t) (see. Figure [3T]- 

We can also project the full model onto an affine subspace by shifting S by x, 
where x is the mean value of x over a given time interval [0, T], and given by 

(2.7) ^^tI 

In principle, there is no fundamental difference between a linear projection and an 
afBne projection. (|2.2|) through (|2.6|) are still valid if we apply the following substitu- 
tions, 

(2.) .^(^),.^(;)..^(: f), *--(f -=-) 

where z — ^^x. Similar transformation rules can be applied for x and x. 

Although a reduced model can be constructed by Galerkin projection on an ar- 
bitrary subspace, its solution might not approximate the original one with a high 
accuracy unless the subspace is (approximately) invariant of the solution w{t,XQ). In 
this article, a subspace S* called is invariant of w{t,xo) (or invariant, for short) if 
xq G 5*, and the solution orbit given by wit, xq) = x{t) lies on S* , 



(2.9) 



P*x = X, 
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where P* G M"^" is the projection operator corresponding to S* . In this case, 
the governing equation of the reduced model (|2.6p and the full model (|2.1[) must be 
identical, which requires P* satisfies the following condition, i.e., 

(2.10) {P*)f{t,x)^ f{t,x). 

Equations (|2.9p and (|2.10p means an invariant projection operator P* not only pre- 
serve the state vector, but also preserve the vector field, i.e. the dynamics. Since an 
invariant subspace S* is uniquely determined by its basis vectors, the next subsection 
discuss the classical POD method which can construct empirical eigenfunctions from 
a data ensemble. 

2.2. Proper Orthogonal Decomposition. The idea of POD is to deliver a 
set of empirical eigenfunctions that the error for representing the given data onto the 
spanned subspace is in a certain least squares optimal sense [50]. Suppose a data 
ensemble is given by x{t) £ R" with t € [0,T]. POD seeks a projection P of fixed 
dimension k that minimizes the total error |20] 

(2.11) / \\x{t) - Px{t)\\^ dt. 



Jo 

After defining an n x n autocorrelation matrix 

(2.12) R := x{t)x{tf dt, 

Jo 

a necessary condition to minimize the error (j2.1ip is to solve the eigenvalue problem 

(2.13) = A,(A,, Ai > ... > A„ > 0, 

and construct the optimal projection operator P G M"^" through 



(2.14) P = Y^ 



k 

i=i 



In the numerical simulation, it is more feasible to rewrite the entire data ensemble 
x{t) in a discretized form, 

(2.15) X:= [a.(ti)v^,...,x(i™)v/^], 

where ii, ...tm arc discrete temporal points, and ■■■5m are quadrature coefficients. 
In this case, the integral in (|2.12|) may be written as i? = XX^ , and the empirical 
eigenfunctions {ipiVi^i can be computed by (|2.13|) . However, when the number of 
snapshots m is smaller than the dimension of states n, it is more efficient to solve the 
following n X m singular value decomposition (SVD) problem, 

(2.16) X = Uk-^'^V'^, 

where U e M"^'' and V G K™x'" are orthogonal, and A G M'"^'' is a diagonal matrix 
which consists of r = min(n, m) nonnegative diagonal elements arranged in decreasing 
order, i.e. Ai > ... > A^ > 0. These singular values in A are the same as in (|2.13p . If 
the first k singular values are significantly larger than the rest, one can take a good 
approximation of X by only calculating k column vectors of U and V corresponding 
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to the k largest singular values. This can be much quicker and more efficient if k ^ r. 
Let Ui be the ith column of U, then the POD basis calculated by the truncated SVD 
is given by 



(2.17) $ := [ui,...,ufc] e 



Although the truncated SVD is no longer an exact decomposition of the original 
matrix X, but it provides the best approximation to the original data X with least 
square error for the same dimension. In the rest of this paper, a SVD process refers 
to truncated SVD unless specified otherwise. 

Let Er denotes the energy of the full system, 

(2.18) [ ||a:(Of di = ^A„ 

and Ek denotes the energy in the optimal k dimensional subspace, 

(2.19) Ek= / ||Px(t)f dt = Va.. 

■^0 ^=l 

A criterion can be set to limit the approximation error in energy by a certain fraction 
77. Then, we seek A: ^ r so that 

(2.20) 1^ > 7?, 

where k represents the number of empirical eigenfunctions necessary for an approxi- 
mation with relative mean square error less than 1 — rj. The POD basis is optimal for 
model reduction since no other linear basis can represent better energy approximation 
in the same dimension. 

The key idea of POD and other projection based reduced models is to find an 
(approximating) invariant subspace on which all the solution vectors live. Fortunately, 
for most large scale complex systems although the states are in general represented 
by a high dimensional space, the solution orbit is actually embedded in an invariant 
subspace with significantly lower dimension. Therefore, it is possible to construct an 
accurate reduced model by forming a projection. In the past decades, many offline 
manifold learning techniques are developed to approximate the invariant subspace 
through a pre-computed data ensemble [3] . Although these methods can significantly 
increase the computational speed during the online stage, the cost of construction 
of data ensemble in the offline stage is often very expensive. Moreover, there is no 
guarantee that the pre-computcd data vectors can span an invariant subspace for the 
same dynamical system but with different parameters. For these reasons, it is desired 
to develop an online manifold learning technique with inexpensive cost. As shown in 
the next section, Picard iteration can be employed to obtain POD-required snapshots 
from the data in the previous iteration. Consequently, it is possible to construct an 
approximating invariant subspace without a precalculated data ensemble obtained in 
the offline stage, as shown by the reduced Picard iteration (RPI) approach. 

3. Reduced Picard Iteration (RPI) Approach. In this section, we begin 
with a brief review of the well-known Picard iteration process for solving the initial 
value problem. Subsequently, the RPI approach is introduced by combining Picard 
iteration with some model reduction techniques. 
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3.1. Picard Iteration. Picard iteration is a constructive procedure for estab- 
lishing the existence and uniqueness of a solution to an initial value problems. For- 
mally integrating the equation in (|2.ip with respect to t yields 

(3.1) xit)=xo+ I /(r,x-(T))dr. 

The right side of (|3.ip can be viewed as an operator, T , acting on the function x(t) 

(3.2) T{x)=xa+ f /(T,x(r))dT. 

Picard iteration process starts with selecting a test function x^{t). Then, the operator 
T is applied to this test function to obtain a better approximation, x^ = T^x'^). 
Recursive application of this operator generates a sequence of functions 

(3.3) x^t)^T{x^-'){t), i = l,2,..., 

where the superscript j denotes the jth iteration. The following theorem summa- 
rize existence and uniqueness of solutions to first-order equations with given initial 
conditions. 

Theorem 3.1 (Picard-Lindelof Existence and Uniqueness [17]). Suppose there 
is a closed ball of radius b around a point xq € M" such that / : Jq x Bh{xo) — > R" 
is a uniformly Lipschitz function of x £ Bh(xo) with constant K, and a continuous 
function oft on Jq = [—a, a]. Then the initial value problem 112. 1\) has a unique 
solution x{t) for t S Jq, provided that a = b/M where 

M= max \\f{t,x)\\. 

In the study of differential equations, this theorem is an important application of the 
the Banach fixed point theorem to a sequence of functions defined by Picard iteration. 
The fixed point x* — T{x*) is proved to be the solution to the initial value problem 
(j2.ip . The Picard iteration technique (algorithm[l]) is reproduced here for comparison 
with algorithms in the following sections, and also because our methods use ideas 
from Picard iteration. 



Algorithm 1 Picard iteration 
Require: The initial value problem (|2.ip . 
Ensure: An approximating solution x{t). 

Set a test function Sp{t) as the trial solution. Initialize the iteration number j ~ 0. 

repeat 

1: Update the iteration number j = j -I- 1. 

2: Compute x^ {t) = T{x^^^){t) p.2p to update the approximating solution, 
until \\x^ — x-'^^ll < e, where e is the tolerant error. 
Obtain the final solution x(t) — x->{t). 
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3.2. Reduced Picard Iteration. Even though Picard iteration has good asymp- 
totic eonvergenee properties, it has httlc practical value for solving initial value prob- 
lems because recursive time integrations in a high dimensional space are very ex- 
pensive. Therefore, it is desired to combine model reduction techniques and Picard 
iteration for cost saving purposes. After rewriting (j3.3p in a differential form, 

(3.4) ±^ ^ f{t,x^~^); x^{Q) = xo, 

we can form a projection of p.4p and obtain a reduced model in the original coordinate 
system, 

(3.5) = P^fit, P^x'-^); xl = P^ixo). 

The associated reduced model in the subspace coordinate system is 

(3.6) = (<i>^f /(t,$^F-i); z^-$^'(xo), 
with a solution 

(3.7) z^it) = + (ci>^)^ r /(T,<I)^F"i(r)) dr, 

Jo 

where z^~^ is the transformed coordinate of z^^^ from S^~^ to , 

(3.8) S^-^ = ($^)^(i/^'^^) = ($^')'^$^-i(z^-^). 

Equations (jO)) . ((3^ . and ([33)1 respectively correspond to ((2TT|) . ([231) . and ((23)) . 
S'-' is an approximation for invariant space 5** at iteration j. Parallel to p.9p and 
(f2J0l) . we have 

(3.9) P^x^-^)^x^-\ 
and 

(3.10) P^(/(t,x^-i)) = /(t,£^-i). 

Therefore, both the state vector x^~^ and the vector filed f{t,x^~^) are invariant 
under P^ . Let Jm be the maximal open interval containing t = while a unique 
solution exists for the original system p.ip . J,{j be the corresponding interval for the 
reduced model, and dM]), at iteration j. J := J^^r] [0, T], and J-?' := n [0, T]. 
5^ can be expressed as 

(3.11) = span(X^-\F^"i), 
where 



(3.12) ■.^{x^t):te.P}, 
is the approximating orbit at iteration j, and 

(3.13) ■.^{f{t,x^{t)):teJ'}, 



is the associated vector field along the approximating solution. 
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If the solution orbit is given as "snapshots" x{ti) at discrete times ii, then 
one can assemble the data into an n x m matrix 

(3.14) X' := [x'{h)^/s'i,...,x'{t^)^/sZ]. 

Accordingly, samples of vector field along the approximating orbit can form another 
n X m matrix, 

(3.15) := [fih,x^ih))^u-,.f{tra,x^Un))y/S^]. 

A combination of and gives the information matrix, which is used to represent 
an extended data ensemble 

(3.16) f^' [X^7F^'], 

where 7 is a weighting coefficient. 7 = 1 is a typical value to balance the truncation 
error of X^ and F^ . By p. lip , we have 

(3.17) 5^ «span(y^-i), 

which means the basis vectors of can be obtained by SVD of Y^~^. However, it 
should be emphasized that the POD (or SVD) method is not the unique method to 
construct the empirical eigenfunctions from the information matrix, and in principle 
it can be substituted by many other snapshot-based model reduction techniques in 
[3] . Algorithm [2] lists the procedures of the RPI approach. 



Algorithm 2 Reduced Picard iteration (RPI) 
Require: The initial value problem (|2.ip . 
Ensure: An approximating solution x{t). 

Set a test function a;"(t) as the trial solution. Initialize the iteration number j — 0. 

repeat 

1: Update the iteration number j ~ j + 1. 

2: Assemble snapshots of an approximating solution x^~^{t) into matrix form 

3: Compute vector field matrix F^^^ whose columns associated with snapshots 
in X^-^ 

4: Form an information matrix for the extended data ensemble Y^^^ = 
[XJ-i, 7/^^-1]. 

5: Based on Y^^^, compute the empirical eigenfunctions $^ through POD or 
other model reduction techniques. 

6: Project the original equation onto a linear subspace spanned by $^ and form 
a reduced model (|3.6|) . 

7: Use (|3.7p to evolve the approximating solution z^{t) for all t £ . 

8: Express the updated solution in the original coordinate system (i) = $^ z^ (t). 
until \\x^ — x^~^\\ < e, where e is the tolerant error. 
Obtain the final solution x{t) = x^it). 



When the width of all the sub-intervals 5i,{i = l,...,m) approaches zero, the 
subspace spanned by F-^^^ converges to In this case, RPI keeps all the information 
of the original Picard iteration, and x^ « x^ is valid for each iteration. Furthermore, 
time integration in algorithm [2] is carried out in a reduced subspace, resulting in 
significant computational saving as compared with algorithm [T] 
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Table 4.1 

Comparison of direct method, Picard iteration and implicit Picard iteration methods. Each 
technique has the full model (first row), the reduced model in the original coordinate system (second 
row), and the reduced model in the subspace coordinate system (third row). 



direct method 


Picard iteration 


implicit Picard iteration 


i = (EH 
i = pf{t,x) (EU 

i = $7^/(t,$z) (1^ 


= f{t,xi-^) dSH) 

= P3f{t,P^x^-^) dm 
ii = dsn 


x^f{t,x) dlH 

= pi f it, x^) gH 

) iJ = dm 



4. Implicitly Reduced Picard Iteration (IRPI) Approach. Although RPI 
achieves significant saving as compared with the original Picard iteration, it still has 
room for substantial improvement. First of all, RPI requires allocating large memories 
to record the entire trajectory for each iteration in order to obtain a better approxima- 
tion for the next iteration. Secondly, algorithm [2] introduces an extra step, coordinate 
transformation (j3.8p from Si~^ to , for each iteration which introduces extra com- 
putational overhead. For these reasons, an implicit time integration scheme is applied 
for a modified version of RPI, in which time integration is computed independently 
with the approximating trajectory obtained in the previous iteration. 

4.1. Algorithm. Algorithm [5] uses an explicit time integration scheme to iter- 
atively update the approximating solution (|3.7p . For the purpose of memory saving 
and avoiding coordinate transformation from one subspace to another, a much more 
efficient algorithm, implicitly reduced Picard iteration (IRPI) method, is proposed in 
this section. It is noticed that if x^~^ is replaced by x^ , then the reduced model p.Sp 
becomes 

(4.1) =Pif{t,xi); xl = Pi{xo). 

This can be considered as a modified version of RPI. However, time integration p.7p 
can not be calculated explicitly here. Let = {^^)'^x^ , an analogy to p.6p gives the 
reduced model in the subspace coordinate system for iteration j, 

(4.2) = i'f' ffit, -f'z^); 4 = i-^'fi^o). 

Equations (|4.ip and (|4.2p can be directly obtained by projecting the original system 
(|2.ip onto . Compared with Picard iteration and RPI, IRPI only records a set 
of empirical eigenfunctions rather than the entire trajectory at each iteration, and 
seeks a better approximation by projecting the full model onto an updated subspace 
spanned by these empirical eigenfunctions at each iteration. Algorithm [3] reflects this 
change and list comprehensive procedures of the IRPI technique. 

Table 14.11 compares the governing equations for these three techniques: direct 
method, Picard iteration and implicit Picard iteration. The full model of IRPI is the 
same as the one of the direct method. However, different from the reduced model of 
direct method, IRPI is an iterative solver, using an updated projection matrix at each 
iteration. 

In the rest of this subsection, we discuss the geometric meaning of this algorithm. 
During each iteration, a new subspace is constructed followed by an approximating 
solution a;^"^(t). As x^ (t) x{t), — > S*. For this reason, IRPI is an iterative 
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Algorithm 3 Implicitly reduced Picard iteration (IRPI) 
Require: The initial value problem (|2.1|) . 
Ensure: An approximating solution x{t). 

Set a test function x^{t) as the trial solution. Initialize the iteration number j = 0. 

repeat 

1; Update the iteration number J = J + 1. 

2: Assemble snapshots of an approximating solution x^^^{t) into matrix form 
A^-i. 

3: Compute vector field matrix F^~^ associated with snapshots in X^~^. 

4: Form an information matrix for the extended data ensemble Y^~^ = 

5: Based on Y^^^, compute the empirical eigenfunctions <i?^ through POD or 
other model reduction techniques. 

6: Project the original equation onto a linear subspace spanned by and form 
a reduced model (|4.2p . 

7: Solve the reduced model and obtain an approximating solution {t) in the 
subspace coordinate system for all t G . 

8: Express the updated solution in the original coordinate system x^ (t) = (t) . 
until \\x^ — x^~^\\ < e, where e is the tolerant error. 
Obtain the final solution x{t) = x^{t). 



manifold learning procedure, which approximates the invariant subspace by a scries of 
subspaces. For a complete iteration cycle we begin with collection of snapshots from 
the previous iteration (or an initial test function) , then construct a subspace spanned 
by a matrix which contains both the solution and the associated vector field, and then 
generate empirical eigenfunctions by POD or other techniques, and end with solving 
a reduced-order equation. 

According to algorithm^ a trial solution is given initially. One simplest approach 
is to set it as a constant, i.e., x^{t) = xq. Thus, is spanned by the initial condition 
xq and the initial vector field /(0,a;o). Then, the initial empirical eigenfunctions 

can be calculated by SVD of the initial information matrix F° = [xq, f{0,xo)]. 
Projecting the full system onto the produces a reduced model. Consequently, a 
better approximating solution x^{t) can be obtained. In fact, solution vectors near the 
starting point approximately reside in S^, which can be justified by Taylor expansion 
of x(t) to the first order approximation, x{t) = xo + /(O, Xo)t + O(t^). However, many 
consequent states do not reside in and the error in x^{t) is expected to increase 
significantly after a few steps. In the second iteration, the aim is to obtain a better 
approximation for the invariant subspace S*. The solution matrix can be 
constructed by extracting m solution vectors from the x^{t). A large number for m 
will lead to intensive computation, but the selected snapshots should reflect the main 
dimensions of the orbit. Since all the projected solution vectors in A^ exactly lie on 
5^, decomposition of X^ through SVD will not achieve empirical eigenfunctions that 
can span a different subspace. Thus, we seek some other solution vectors that do not 
lie on S^. Notice that ii x{ti) {i G {1, .., m}) is a solution vector that (approximately) 
lies on the invariant subspace S* , so does x{ti) + f{ti, x{ti))St. Therefore, the matrix 

Ai + F'6t = [...(x^U) + fiU,x\U))St)^,...], 
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Figure 4.1. Graphical description of IRPI. A trial solution Sf'it) = xq is set initially. For 
each iteration, a reduced model is obtained by projecting the full model onto . x^(t), represented 
by a curved line, is computed by the reduced model. After extracting snapshots from x^{t) and 
constructing an information matrix of extended data ensemble, a better subspace S^^^ is obtained, 
A seguence of the approximating solutions x^ (t) converges to x{t) asymptotically, 

can also be used to provide more basis functions. Thus, a better subspace can be 
expressed as 



which is the case when j = 2 in p.l6p . 5^ is a hnear subspace that contains both 
solution vectors obtained from iteration 1 and the neighbors of these vectors. Decom- 
position of through POD gives an updated matrix $^ for empirical eigcnf unctions. 
Then an updated solution x'^{t) can be obtained. Now, wc can repeat this process 
to obtain solution matrix for the next iteration. In the next subsection, we shall 
prove analytically that the proposed approach can achieve convergent solution for the 
full system. 

4.2. Analytical Results. This subsection presents some analytical properties 
of IRPI. Since the algorithm use the classical Galerkin-POD approach for the online 
manifold learning, we first prove the existence and uniqueness of the solution of the 
reduced model formed by a Galcrkin projection. Then wc follow |31j to analyze the 
error introduced by POD. After that, we shall prove the convergence of the sequence 
of functions obtained by IRPI. Theorem 13.11 states that if the vector field is locally 
Lipschitz, then the original system exists a unique solution. This condition is also 
sufficient for the existence and uniqueness of the solution in the reduced model. 

Lemma 4.1 (Local Existence and Uniqueness of reduced models). Let Bi,{xo) 
be a closed hall of radius b around a point xq G M" and / : Jo x Bf,{xo) — >■ M" be 
a uniformly Lipschitz function of x with constant K, and a continuous function oft 
on Jo ~ [—a, a]. Then the associated reduced model x ~ Pf{t,x) \2. 6]) has a unique 
solution at the interval t E Jq for a given initial condition xq — Pxq, provided that 





=span{X\X^ + F^St) 



span(y^). 
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a — b/M where 

(4.3) M= max \\f{t,x)\\ 

{t,x)eJoxBb{xo) 



Proof. In order to prove this lemma by theorem lS-H we first show that the reduced 
vector field is locally Lipschiz. This is because f{t, x) is a uniformly Lipschitz function 

for all (t, x) Cz Jo X Bb{xo), for every point xi,X2 G Bb{xo) and t € Jq, then we have 

\\f{t,Xi)-f{t,X2)\\<K\\xi-X2\\. 

Recall that P is a linear projection matrix, and ||P|| = 1. As a consequence, 

\\Pf{t,Xi)-Pf{t,X2)\\ < \\P\\ \\f{t,Xi)~f{t,X2)\\<K\\xi-X2h 

which justifies that the projected vector field Pf{t,x) is Lipschitz with constant K 
for all {t, x) G JqX Bb{xo). Then we can apply the theorem l3.1l to yield local existence 
and uniqueness of the solution of the reduced-order equation x = Pf{t, x) (|2.5p for 
t e [—a', a'], where 



max \\^f{t,x)\\' 

{t,x)eJoxBb{xo} 

Since < \\f{t,x)\\, we have a' > b/M where M is defined by (gS]). There- 

fore, there exists a unique solution for the reduced-order equation in the interval 
J = [—a, a], where a — b/M . □ 

The next few theorems discuss the error of IRPI e{t) = x{t) — x{t). Since the 
reduced model in IRPI is formed by Galerkin projection, as discussed in Section, 
the e(t) can be decomposed into two parts: eo{t), the component orthogonal to the 
projected subspace S, and ei[t), the component parallel to S. It will be shown that 
ei{t) depends on eo(i). Substituting ()2.ip and p.6|) into the differentiation of eo{t) + 
ei{t) — x{t) — x{t) yields 

(4.4) eo + e,=Pf{t,x)- f{t,x). 

Using P^ = P and left multiplying (j4.4p by P, we can derive the governing equation 

for ei{t), 

(4.5) e^{t) = P{f{t, x + eo + e,)- /{t, x)), 

where x{t) and eo{t) can be considered as forcing terms. Proposition 4.2 in [3T] reveals 
that the error of ei{t) is bounded by eo{t) and x{t). In order to introduce the following 
convergence theorem, we restate this property with minor modifications. 

Lemma 4.2. Consider solving the initial value problem given by h2.1\] using the 
reduced model in the interval Jq = [0, a]. Let P G M"^" be the projection matrix 
associated with a linear subspace S . Let x{t) be the solution of the full model, x{t) = 
x{t) + eo{t) be the projection of x{t) onto S, and x(t) = x{t) + Coit) + ei{t) be the 
solution of the reduced model. Let K be the Lipschitz constant of f{t,x) and 

\\f{t,X^)~ f{t,X2)\\<K\\x^~X2\\ 



AN ONLINE MANIFOLD LEARNING APPROACH FOR MODEL REDUCTION 



15 



for all (t, Xi), (t, X2) E D d Jq X M", where the region D contains (t,x{t)), {t,x{t)) 
and (t, x(t)) for all t G Jo- Then the error e — e^, + e.^ in the infinity norm is bounded 
by 

(4.6) ||e|L<e^'^I|e„|U + ||e,(0)I|. 



Proof. The governing equation of ei(t), given by (|4.5|) . yields 

\\eS + h)\\ = \\e,{t) + hPf{t, x + eo + Ci)- hPf{t,x)\\ + 0{h^) 
< \\e,{t)\\+h\\Pf{t,x + eo + e^)-Pf{t,x + eo)\\ 
+h\\Pf{t,x + eo)-Pf{t,x)\\+0{h'-). 

Applying the Lipschitz conditions, wc get 

\\e,{t + h)\\ - \\e,{t)\\ /.Ml, 7,. II t^\i\,nth\ 
^- < K\\ei{t}\\+K \\eo[t)\\ +0[h). 

Since 0{h) can be uniformly bounded independent of e(i), we can obtain 

||e.W||<A^ Te^^*-^) ||e„(r)|| dr+||e,(0)||. 
Jo 

By definition, ||eo||j^ > j|eo(t)|| for any t G Jq. It fohows that 

||e,||^ < /°+'^ eW-^) dr + ||e.(0)|| 

= (e^"-l)l|eolL + ||e.(0)||. 

Using e{t) — eo{t) + ei{t), we have 

||e|L<l|e.lL + l|eolL<e^'^|leo|L + ||e,(0)||. 

Therefore, an upper bound for ||e||j^ is obtained. □ 

Lemma |4?2] imphes that as long as el{t) — !> as j 00, then e^ (i) — > 0. Now, we 
are ready to discuss the main property of IRPI. 

Theorem 4.3 (Local Convergence of IRPI). Consider solving the initial value 
problem given by Let B^ixQ) denote a closed ball of radius b around a point 

.To G K" and Jq be a closed time interval, Jq = [0,a]. Suppose f{t,x) is a uniformly 
Lipschitz function with constant K for all {t,x) G Jq x Bb{xQ). The LRPI approach 
is applied to obtain an approximating solution. At iteration j , let x{t) be the solution 
of the full model, x^{t) = x{t) + e^(t) be the projection of x{t) onto , and x^ (t) = 
x{t) + e-l{t) + ej{t) be the solution of the reduced model. Ln an ideal situation, xq € , 
while the solution and vector field satisfy x^~^(t) C and f{t,x^~^) C for all 
t G Jq, respectively. Then the sequence of functions x^ (t) uniformly converges to x(t) 
for all t € Jq provided that 



(4.7) a < min 



M' K 



where M is given by 
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Proof. Since f{t,x) is a uniformly Lipschitz function for all {t,x) £ Jq x Bi,(xq) 
with constant K, theorem 13.11 implies the existence and uniqueness of the solution 
x{t) of the full model (|2.1I) . Then we can uniquely determine x{t) by x{t) ~ P^x{t). 
Meanwhile, by lemma I4TT1 the projected vector field Pf{t,x) is also Lipschitz with 
constant K for all (i,x) S Jo x Bb{xo) and there exists a unique solution x^ (t) for the 
reduced model p.6|) . Therefore, x{t), x{t) and x^ (t) uniquely exist for all t G Jo- 

Next we discuss the error e{t) = x^{t) — x{t) of the reduced model in the interval 
Jq. As mentioned before, the error has orthogonal components eo(t) and parallel 
components ei{t). Multiplying (|4.4p on the left by / — P-', we obtain the governing 
equation for Coit), 

(4.8) ei^-iI-P^)f{t,x). 

Since xq £ , we have P^xq = xq. Therefore, the initial condition is given by 
e^(0) = 0. Since the vector field f{t,x^^^) is invariant under the projection operator 
P^ (see 13.10]) . we have 

{I-P^)f{t,x^-')=0. 

It follows that 

(4.9) ei = {I- P^)[f{t, x + e^-')- f{t,x)]. 
For h > 0, Taylor expansion of e^(i) gives 

(4.10) \\ei{t + h)\\< \\eiit)\\ + \\hil - P^)[f{t, X + e^-') - /(t, x)] \\ + 0{h'). 

Considering x(t) G Bt,(xo), x^~^{t) G Bi,{xo), and f{t,x) is a uniformly Lipschitz 

function for all {t,x) G Jo x Bb{xo) with constant K, it follows that 

(4.11) 

||(J - Pn[f{t,x + e^-') - f{t,x)]\\ < \\f{t,x + e^-') - f{t,x)\\ < K \\e^-\t)\\ . 
We can combine (|4.10p with (|4.1ip to obtain 

(4.12) Mi±^t_^£M<ir||e^-i(t)||+0(/,), 

h " " 

where the 0{h) term may be uniformly bounded independent of e^^^(<). Integrating 
(|4.12p with respect to t yields 



(4.13) \\eiit)\\<K[ \\e^~^{T)\\ (It < Ka\\eJ-^l . 

Jo °° 



As f(t,x) is a Lipschitz function for all {t,x) G Jo x Bi,{xq), we have x{t) G Bi,{xo), 
x{t) G Btixo), and x{t) G Bb{xo). Notice that 6^(0) = since the starting point Xq is 
the projection of xq onto . 

In order to apply lemma l4?2l we will show that x{t) G Bh{xo). Provided a < b/M, 
this is valid since 



(4.14) \\x{t)-xo\\ = 

In addition, we have 



f{T,x)dT 







< / ||/(r,x)|| dr < Ma < b. 







(4.15) 
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which means that (t) G B(,[xq). Shnilarly, since 



(4.16) 



\x^(t)-xo\\ = \\x^ (t) - xo\ 



Pfir, x^)dT 



we achieve {t) e Bh{xo) for all t e Jo- As a consequence, we can combine (|4.6p in 
lemma 22] with (|4.13|) to obtain 



(4.17) 
Since 



Kae'^'' < Kae""'/" < 1, 



then a sequence of functions ||e^(<)|| uniformly converges to 0, which means the se- 
quence of functions x^ {t) uniformly converges to the fixed point x{t) for all t G Jo- □ 



Remark 4.4. Ideally a linear subspace at iteration j contains an approximat- 
ing solution x^^^{t) with the associated vector field f-'^^{t) of the previous iteration 
for all t ^ Jq, which is the case for equations p.9p and (|3.10p . However, it is more 
often to form S-' by extracting the first few dominant modes from the matrix of ex- 
tended data ensemble p.l6p . In this circumstance, (|3.9p and (|3.10p should be replaced 

by 

(4.18) P^{x^-'^)^x^'\ 
and 

(4.19) P^(/(t,x-^--i))«/(t,£^-i), 
respectively. To quantify the error from this modification, we define 

(4.20) \\{l-P')x'-\T)fdT+^[ ||(/-PJ)/(T,xJ-i)f dr. 
a Jo ^ Jo 

If SVD is used to construct the POD basis, e can be estimated by 

n 

(4.21) s= Xl 

where is the dimension of subspace and A| is the ith eigenvalue of autocorrelation 
matrix for the extended data ensemble 

(4.22) = [x^-\t),jf{t,x^-% 

If the fraction error in energy truncation that SVD gives is bounded by 1 — 77, one 
obtains 

(4.23) e< (l-7y)i?^", 

where i?^ , as defined in (|2.18p , denotes the total energy of the extended data ensem- 
ble . In practice, it is more tractable to take the discrete form Y^ (|3.16p as an 
approximation for Y^ . Accordingly, a more strict bound on e is given by 

(4.24) e<pil-r^)Ei, 
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where p is a cocfScient associated with temporal discretization. Let d be the maximal 
time interval among the neighboring snapshots. If f{t,x) G C^{Jq x Bij{xq)), then we 
can obtain 



(4.25) 



6 

1 H — max 

a t£Jo 




which implies that p ^ 1 as 5 —5- 0. Since the IRPI approach uses a snapshots- 
based reduced model, such as POD, for each iteration, consequently the sequence of 
functions can not converge to the exact solution of the original system. However, as 
proposition l4.5l indicate. we can always bound the error of the approximating solution 
obtained by IRPI. 

Proposition 4.5. Consider solving the initial value problem given by Let 
Bb{xo) denote a closed ball of radius b around a point xq G M" and Jq be a closed time 
interval, Jq = [0, a] . Suppose f{t, x) is a uniformly Lipschitz function with constant K 
for all {t,x) S Jo X Bb[x{,). The IRPI approach is applied to obtain an approximating 
solution. For iteration j , empirical eigenf unctions, which is used to span a linear 
subspace , are calculated by SVD of the information matrix \3.16\) . Let x(t) 
be the solution of the full model, x^ (t) = x{t) + e^(i) be the projection of x{t) onto 
, and x^ (t) ~ x{t) + e^(i) + el{t) be the solution of the reduced model. Then the 
sequence of functions x^ (t) defined by IRPI approaches to x{t) with an upper bound 
for \\e\\^ given by 



(4.26) 



X 



aee 



Ka 



(1 



72(1 -isTae^'^) l-Kae^"' 



for all t £ Jq provided that 
(4.27) 



a < min 



fj ^-Kb/2M 

2M' 



K 



where M is given by ^J^, is the maximal error of the initial condition in the reduced 
model, and 9 < b/2. 



Proof. Here we follow a similar approach to the proof of tlieorem l4.3l As proved in 
theorem 14. 3[ x{t) and x{t) uniquely exist in the interval Jq. Moreover, x(t) G i?b(xo) 
and i-'^^(t) G Bb{xo). In a general situation, the initial condition of the reduced 
model p.5p can be expressed as Xq = xq + e-^ (O). x^{t), the solution of nearby initial 
condition, still uniquely exists. Moreover, x^~^{t) G Bb{xo), which can be justified by 



(4.28) 



x\t) 



X.0 



< 



x^{t) 



X.0 



Xq 



< b/2 + b/2 = b. 



Compare with ()4.8|) . the governing equation of eo{t) has a more complex form, 

ei = (/ - P^ifit, X + e^-') - fit, x)] -{I- P^)f{t, X + e-'"-!). 

Under a non-ideal situation, the second term on the right hand side is nonzero since 
p.lOp is no longer valid. For the same reason, the initial value of e^(0) is nonzero. 
Expending e^(i) and using the inequality (|4.1ip . one obtain 



(4.29) 



H{t + h)\\ - \\ei{t)\ 



<K\\e^~^{t)\\ + \\{I - P^)f{t,x + eJ-^)\\ + 0{h), 
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where the 0{h) term may be uniformly bounded independent of ^(t). Integrating 
(|4.29p with respect to t yields 

(4.30) \\ei{t)\\<K f \\eJ-\T)\\ dr + f - P^)f{T, x + e^-^)\\ dr + \\ei{0)\\ . 
Jo Jo 

As shown in (|4.13p . the first term on the right hand side is bounded by Ka ||e-'^^||^. 
By the definition of e (|4.20p , the second term on the right hand side is also bounded 
for all t € Jq, 



(4.31) 



Ul - P^)f{T,x + e^-^)\\ dT<ae/j^ 



Combining (gS]), (|4l^ , (|430)) and gJU, one obtains 



(4.32) 

If the initial error is bounded by 



aee 



Ka 



+ e^M|e^„(0)||+ em 



HiO)\ 



< 



m 



< 



then 
(4.33) 

Considering 



le^'ll < Kae^^We^-^ 



aee 



Ka 



Kae'"' < Kae""'^^" < 1, 



9oe^" + I 



the total error is bounded by 



(4.34) 



< 



aee 



Ka 



r,Ka 



"f^il ~ Kae^^"-) 1-Kae"^°- 1 ~ Kae 



Ka ■ 



Since 9 > Oo and 6 > 6i, then a sequence of functions ||e-' (i)|| is bounded by a constant 
X as j — > cxj, where x is given by (|4.26p . □ 

Remark 4.6. Proposition l4.5l provides an local error bound for the IRPI method. 
The first term on the right hand side of (|4.34p is introduced by the truncation error. 
By decreasing the width of time intervals among neighboring snapshots and increasing 
the number of POD modes, we can limit the value of e smaller than any positive real 
numbers. The second term is the magnified error caused by e^(0), the initial error 
orthogonal to . If xq does not resides in 5*^, then e^(0) = (/ — P^)xq ^ 0. The 
third term in (|4.34p comes from the initial error on . It vanishes if the projection 
of xq onto is the starting point for the reduced model, i.e., Xq = Pxq. However, 
if Xq is obtained by numerical calculation from a previous time, the third term can 
be nonzero. In order to balance the truncation error of the vector field and initial 



condition, 7 in p.l6p can be set to 7 = 1. When e = 0, and ||e^(0)|| = 0, proposition 
4.51 degenerates to the case in theorem 14. 3[ i.e. x = Oj which means the sequence of 
functions x^ {t) converges to the fixed point x{t) for all t G Jo- 
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At the first glance, the error bound x is only determined by the initial error 9 
and truncation error of the vector field. The proof of convergence in theorem 14.31 and 
its proposition 14.51 do not require that ff~^{t) resides in for any t £ (0,T]. As an 
alternative to (j3.16p . a more straightforward form of the information matrix for the 
extended data ensemble can be written as 

(4.35) ■.= [xo,-iF^]. 

In this case, the new sequence of functions can still converge to x{t). The main reason 
we use (j3.16l) is that we want to find an optimal subspace that minimize the error of 
the reduced model for the entire trajectory. Specifically, as a Picard-type iteration, 
IRPI can only guarantee that the next iteration can decrease the error for a limit time 
interval Jq = [0,a]. Then, (|4.17p can be rewritten as 

lie-'' II 

(4.36) „". '°° < ifae^°. 

Ile^-'lloo " 

When a > 1/K, the left side might be greater than 1. Thus, although {x) has less 
error locally, it might be worse than x^~^{x) in a global sense. By adding consequent 
solution vectors in the extended data ensemble, we have x^~^ C . Consequently, 

(4.37) ||e^oW||<||e^-'W||- 

Using ()4.6|) in lemma l42l and assume ||ei(0)|j = 0, one obtains 

(4^38) Wt^^'"'- 

(|438l) still can not grantee that the ||e-'''^(i)|| < \\eHt)\\ for aU t e J. However, (|438| 
provide a stronger bound than (j4.36|) when t > l/K, which can effectively decrease 
the global error. 

Sofar, we proved the convergence of IRPI for a local time interval Jq = [0, a]. Since 
the estimates used to obtain Jq are certainly not optimal, the true convergence time 
interval is usually much larger. Suppose J = [0,T] C J„i, where is the maximal 
interval of existence of the original solution. We now prove that the convergence 
region Jq can be extended to J. 

Theorem 4.7 (Global Convergence of IRPI). Consider solving the initial value 
problem given by \2.1\) for the interval J = [0,r]. Suppose the vector field f{t,x) is 
locally Lipschitz on D' , where D' is an open set that contains {t,x{t)) for all t £ J. 
The IRPI approach is applied to obtain an approximating solution. At iteration j , let 
x{t) be the solution of the full model, x^ (t) = x{t) + e^^{t) be the projection of x(t) onto 
, and x^ (t) = x{t) + e^(t) + e^(i) be the solution of the reduced model. In an ideal 
situation, xq G 5*^ . Furthermore, the solution and vector field satisfy x^^^{t) C 
and f{t, x^^^) C for all t G J-'^^, respectively. Then the sequence of functions 
x^ (t) uniformly converges to x{t) for all t e J. 

Proof We first define a closed set E = {{t,x{t)) : t e J} d D' . Let d be the 
distance between E and the boundary of D'. Since D' is an open set, we have d > 0. 
Thus, we can choose a positive constant b such that < 6 < d. Since f{t,x) is 
locally Lipschitz on D' and E' = {{t,y) : t ^ J,y ^ Bi,{x{t))} C D' is compact, then 
f{t,x) is Lipschitz on E' . Let K denote the Lipschitz constant for {t,x) G E' . In 
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addition, we can choose the value of a bounded by (|4.7p . Suppose J* C [0,r] be the 
maximal interval containing t ~ such that for all t £ J*, (t) — > x{t) uniformly 
as j — >■ c». Theorem 14.31 indicates that IRPI will generate a sequence of functions 
x^{t) that uniformly converges to x{t) for all t G Jo = [0,a]. For this reason, we have 

J* ^(D. 

Now assume J* ^ J . Then there exists a i ; S J such that ti G J* , ti + a < T and 
U + a ^ J* . U € J* means for every k > there exists a integer Mi(k) > such that 
for all j with j > AIi{k), x^{ti) uniquely exists and — a;(<i)|| < k. 

Considering the initial value problem 

(4.39) y = /(<,y); y{Q) = v^ = x{U). 

The corresponding reduced model of IRPI at iteration / is given by 

(4.40) =P'f{t,v')- y'(0) = yo + e', 

and ||e'|| < k! . Proposition 14.51 implies that for alH G Jo = [0,a], ||y'(^) — 2/(i)|| < X 
as Z — > oo, where x is given by 14.261 Formally, for every > 0, there is another 
positive integer M2{k!) such that whenever / > M2{n'), then ||y'(t) — v{t)\\ < x + 
Plug (|4.26p into this inequity, we have 



(4.41) \\y\t)^y{t)\\< 



aee^" (1 + e^")K 



72(1 - JCae^'') 1-Kae 



for any t G Jq. In an ideal case, the truncation error is zero, i.e., e ~ Q. Then, the 
right hand side of (|4.4ip can be arbitrarily small. Note that y{t) = x{ti + t) and 
y^{t) = x^ {ti + 1). Therefore, for every e > 0, there exist an integer 

(l-i.ae^-)A 

2 + 2e^'^ / V2 



(4-42) N{e) = Ml ( ^^^^.^ ) + ( ^ 



as long as j > N{e), then ||x^(t) — x(t)|| < e for all t G [0,^^ + a]. However, this 
contradicts with our assumption that ti + a < T and ti + a ^ J* . Therefore, J* ~ J, 
i.e., x^ (t) uniformly converges to x^{t) for all t £ J. □ 

4.3. Computational Cost. In the previous subsection, we have shown that 
IRPI results in a series of approximating solutions that converges to an actual solu- 
tion. Next, we will study the computational cost associated with the proposed IRPI 
method. In fact, a reduced-order equation formed by Galerkin projection can obtain 
fast computation only when the analytical formula of reduced vector field f{t, ^z) 
can be simplified, especially when f(t,x) is a polynomial in x. Furthermore, since 
IRPI is an iterative solver, IRPI can not achieve speedups unless it obtains convergent 
results in a few iterations. This subsection analyzes the cost of IRPI for solving an 
initial value problem, then gives some clues for optimized applications. 

Assume an initial trial solution is given. In algorithm [3l the cost of IRPI for each 
iteration mainly involves four procedures: construct of the information matrix for an 
extended data ensemble (Ci, step 2-4), compute of empirical eigenfunctions (C2, step 
5), project the original system onto a subspace and solve the reduced model (C3, step 
6-7), obtain an approximating solution in the original coordinate system (C4, step 8). 
The total cost is 

(4.43) C = RiCi+C2+C3 + Ci), 
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where R is the number of iterations. 

In terms of computational cost, C3 takes the main contribution. Let j{n) be the 
cost involved in computing f{t,x), the vector field in R". The cost of computing 
vector field in a reduced subspace, f{t,^z), is denoted by j{k,n). It is possible 
that 7(fc, n) > 7(71) if f{t, x) is not a polynomial operator. If an explicit scheme is 
applied for time integration, 

(4.44) C3 = 0{N^k,n)), 

where N is the total number of time steps. 

The other procedures are carried out only once at the beginning of each iteration. 
Since an extended data ensemble is computed in the original space, the computational 
cost to compute the vector field at m snapshots is 

(4.45) Ci 0(7717(71)). 

m <^T should be satisfied for cost saving purposes. The cost of constructing a set of 
empirical eigenfunctions depends on the specified technique used. Suppose m <^ n, 
POD basis can be obtained by 0{h?n) flops by SVD [5T], where h denote the number 
of columns in the information matrix. Since the information matrix defined by IRPI 
has 2m columns, we have 

(4.46) C2 = 0{Arr?n). 

Finally, in order to obtain an approximating solution in the original coordinate system 
and collect the snapshots for the next iteration, we need compute x{ti) = ^z(ti) for 
771 snapshots. The corresponding cost is 

(4.47) C4 = 0{mnk). 

Inserting equations (|4.45p - (|4.47p into (|4.48|) . one can obtain the total cost for IRPI. 

(4.48) C = 0{R- (7717(71) + 4771^71 + N-yik, n) + mnk)) . 

C2 and C3 are inherited from the classical Galerkin-POD approach while construction 
of the extended data ensemble need an extra cost Ci and C4. Usually, Ci + C4 <C 
C2+C3. Since algorithmic] is an iterative algorithm, it introduce a fact R for iteration 
times. Considering the cost of the full model is given by C'(iV7(7i)) for an explicit time 
integration, roughly speaking if i? ■ C3 ^ 0{Nj{n)), IRPI can obtain significantly 
speedup. 

Sometimes an implicit scheme for integrating an ODE is preferable to an explicit 
method for stability reasons, and also due to capability of using larger time steps 
during time integration. In terms of computation cost, one only need update the ex- 
pression for C3. More discussion of the cost involves Galerkin-POD approach appears 
in [3T]. KevrekidisI 

It has been noticed that even for a polynomial vector field, POD-Galerkin ap- 
proach might not be able to significantly accelerate computation since it results in 
small but full matrices (tensors) while common discretization techniques usually gen- 
erate large but sparse ones. This effect is even considerably increased when treating 
with POD based IRPI. whose computational cost is roughly R times the standard 
POD-Galerkin approach. For the purpose of computational saving, it is preferable 
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to substitute standard POD by a more efficient snapshot-based technique for dimen- 
sional reduction in algorithm |31 Within the framework of IRPI, one simple approach 
to effectively decrease iteration number R is to set up a "good" trial solution initially. 
For example, if the full model is obtained by finite a difference method with a large 
number of grid points, a coarse model can be set as the same scheme with smaller 
number of grid points and a larger time step. Instead of setting up the initial trial 
solution to be a constant, a better trial solution can be obtained by the coarse model. 
This idea is illustrated in simulations in the next subsection. 

4.4. Numerical Results. The proposed algorithm, IRPI, is illustrated in this 
subsection by two numerical examples: linear advection-diffusion equation and non- 
linear Burgers equation. Both examples focus on demonstrating the capability of 
IRPI to deliver accurate results using reduced models. We also show an application 
of IRPI for posteriori error estimation of a coarse model or reduced model. 

Let u ~ u(t,x) denote a scalar field. Consider the advection-diffusion equation 
with constant moving speed c and diffusion coefficient v, 



on space x £ [0, 1]. Without loss of generality, periodic boundary condition is applied, 

, , u(t,0) 1), 

^ ' u,{t,0)=u.,{t,l). 

The initial condition is provided by a Gaussian function 

(4.51) u(0,x) = e"^(^^T7F^) _ 

The fully resolved model in higher dimensional space is obtained through a high 
resolution finite difference simulation with spatial discretization by n equally spaced 
grid points. The advection term is discretized by an explicit first-order upwind scheme 
while the diffusion term is calculated using the implicit Crank-Nicolson scheme with 
second-order central difference. This results in an ODE of the form, 

(4.52) y = Ay. 

We slightly abuse the notation and denote y G R" as the discretized state vector, 
A £ R"^" is a linear operator in the original space. The associated reduced-order 
equation is 

(4.53) i = A'z 

where z G M'"' is the representation of state in a reduced space, A' e R'^^'' is the 
reduced operator and k <^n. 

For our numerical experiments, we consider a system with c = 0.5 and v = 10"'^, 
which gives rise to a system with diffusion as well as advection propagating to the 
right. This can be seen from Figure l42l a). where the initial condition as well as the 
solution at t = 1 are shown. The full model (reference benchmark solver) is computed 
through n = 500 grid points. In order to initialize IRPI, a smaller simulation is carried 
out by finite difference method with a coarse grid of fc = 20 with larger time step. 
A filtered solution is constructed by extracting the first 10 Fourier modes, which is 
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Figure 4.2. (Color online.) (a) The velocity profiles att = andt = 0.5 of the one- dimensional 
advection- diffusion equation with constant speed c = 0.5 and diffusion coefficient u = 10~^. N = 500 
grid points are used to obtain the full model for the fixed space domain [0, 1] . The coarse solution is 
obtained by extracting the first 10 Fourier modes from the coarse simulation based on n = 20 grid 
points, (b) Convergence of IRPI. Plot of the maximal difference between the benchmark solution 
u(t) the iterative solution u^ (t) fort S [0,0.5], ||e^||oo = sup{||ti^(t) —u(t)\\rx '■ t S [0,0.5]}. Roughly 
speaking It takes only 1 iteration for IRPI to obtain the convergent solution. 




Figure 4.3. ( Color online.) (a) The velocity profiles att = and t = 0.5 of the one- dimensional 
advection equation with constant speed c = 0.5. A' = 1000 grid points are used to obtain the full 
model for the fixed space domain [0,1]. The coarse solution is obtained by extracting the first 10 
Fourier modes from the coarse simulation based on n = 20 grid points, (b) Convergence of IRPI. 
Plot of the maximal difference between the benchmark solution u{t) and the iterative solution u-'{t), 
Ik-' II oo — sup{||ii^(t) — w(t)||oo '. t £ [0,0.5]}. Roughly speaking It takes only 1 iteration for IRPI to 
obtain the convergent solution. 



used as the initially trial solution for the IRPI method. During the first iteration, the 
full-order equation is projected onto a subspace spanned hy k — dominant modes 
and a better approximation is obtained. The convergence plot for IRPI is shown in 
Figure l4^ b). which accords with our theoretical analysis that the sequence of the 
functions u^{t) approaches to the actual solution u{t) after a few iterations. 

When — > the advection-difFusion equation will degenerate to the advection 
equation. We considered the system with c = 0.5 and v ^ Q. The initial Gaussian 
wave propagates to the right without any dissipations. This can be seen from Figure 
14.3( a). where the initial condition as well as the solution at i = 1 are shown. The 
full model (reference benchmark solver) is computed through n — 1000 grid points. 
We still use fc = 20 grid points for a coarse simulation and take the first 10 Fourier 
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modes to construct the trial solution. Different from the actual solution, the trial 
solution has significant diffusion effect. During the first iteration, the full-order equa- 
tion is projected onto a subspace spanned by A: = 12 dominant modes and a better 
approximation is obtained. Figure I4.3r b') compares the maximal difference between 
the benchmark solution u{t) the iterative solution u-' (t) for t e [0,0.5] in the first 10 
iterations. 

Viscous Burgers equation is very similar to the advection-diffusion equation ex- 
cept that the advection velocity is no longer constant. The general form of the one- 
dimensional Burgers equation is given as 

where v is the diffusion coefficient. Let 57 — [0, 1] denotes the computational domain. 
Periodic boundary conditions (|4.50|) are applied. The Gaussian function (|4.5ip is used 
for the initial condition. 

The fully resolved model is obtained through a high resolution finite difference 
simulation with spatial discretization by n equally spaced grid points. We still use the 
explicit first-order upwind scheme to compute the advection term and implicit Crank- 
Nicolson scheme with second-order central difference to compute the diffusion term. 
Truncating the projection of velocity filed u{t,x) on the linear subspace spanned by 
<I> = [01, (pk], we obtain 

fc 

(4.55) u{t,x)=Y,^jit)^ji^)- 

Substituting this approximation into (|4.54l) and using Einstein summation convention, 
the evolution of the reduced coordinates z G M.'' is given by 



(4.56) zi = -AiijZiZj - BijZj. 

hij 

(4.57) 



where A^j G M'^x'^x'^ and Bi^ e K*-'^*-' are constant operators. 



Bii = u{4>r{x),<j,[{x))^. 

Here (•, denotes the inner product of two vectors. 

In the simulation we set v — 10~^. The full model is obtained through n = 
2000 grid points while the trial solution is obtained by extracting the first 10 Fourier 
modes from a coarse simulation with k = 100 grid points. Since one-dimensional 
Burgers equation has a positive velocity, a wave will propagate to the right with 
the higher velocities overcoming the lower velocities and creating steep gradients. 
This steepening continues until a balance with dissipation is reached, which is shown 
through the velocity profiles at t = 0.5 in Figure IT^ a). Since the solution of Burgers 
equation shows a high variability with time evolution, more modes are needed in 
order to present all the snapshots with high accuracy. Meanwhile, IRPI require more 
iterations than the linear equations to obtain the convergence. Equation (|2.20p gives 
adaptive dimension k at each iteration, they are 30, 59 and 92 for the first three 
iterations. The convergence plots for IRPI is shown in Figure H^ b). In the first few 
iterations, the error of the approximating solution decreases, and then converges to a 
fixed value, which is mainly determined by the truncation error of the SVD. In order 
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Figure 4.4. (Color online.) (a) The velocity profiles att = andt = 0.5 of the one- dimensional 
Burgers equation with constant diffusion coefficient v = 10"'^. A'' = 2000 grid points are used to 
obtain the full model for the fixed space domain [0, 1] while n = 100 grids are used to obtain the 
coarse model. The first 10 Fourier modes are extracted to construct the coarse solution. The first 
three iterations respectively take n = 30, n = 59 and n = 92 dominant modes, (b) Convergence of 
IRPI. Plot of the maximal difference between the benchmark solution uit) and the iterative solution 
uHt), ||eJ||^ = sup{\\u^t) - u{t)\\ao :tG [0,0.5]}. 



to obtain higher resolution, for each iteration, we need more snapshots to construct 
the information matrix and keep more modes for the associated reduced model. 

In the rest of this subsection, we illustrate that IRPI can serve as an efficient pos- 
terior error estimation of any coarse models or reduced models. We use the difference 
between the actual solution u{t) and the approximating solution u^{t) as a function 
of t to indicate the accuracy of a coarse model (or a reduced model), 

(4.58) \\e'm = \\u{t)-u'\t)\\. 

However, in many applications, the actual solution u(t) is unknown or very expensive 
to obtain. In this case, we could use IRPI to obtain a better approximating solution 
u^{t) and use the difference between u^{t) and 'u°(<) for the error estimation, 

(4.59) \\A°m = \\u\t)-u°m. 

Figures 14.31 and 14.41 demonstrate that error oiu^{t) is significantly smaller than 
error of u'^{t) for both advection-diffusion equation and Burgers equation. Therefore, 
we expect in general ||A°(t)|| can be served as a good posterior estimation for the 
numerical error of ||e''(t)||, for both linear problems and nonlinear problems. More 
generally, the error of iterative solutions iV (t) computed by IRPI, 

(4.60) \\e^{t)\\^\\u{t)^u\t)\\, 

can be approximated by the difference between u^^^{t) and u^{t), 

(4.61) \\A^{t)\\^\\u^+\t)-uHt)\\. 

For this reason, in algorithm [3l the criterion < e is used to indicate 

the convergence of IRPI. 

Come back to the example of one-dimensional Burgers equation. Figure l475] shows 
II A-' (t) II is a good approximation for the actual error ||e^ (i)|I with time evolution for 
t e [0,0.5]. 
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Figure 4.5. (Color online.) Comparison of the actual error ||e^(t)| 



\\u{t) - u^{t)\\ with 



the estimated error \\Ai {t)\\ = \\ui^^{t) — '!i^ (t)|| for t S [0,0.5], where u{t) is the actual solution 
of the one- dimensional Burgers equation computed by 2000 grid points, u^{t) is the coarse solution 
computed by 100 grid points and u-' it) is the iterative solution computed by IRPI. 



Finally, we should emphasize that IRPI is not an isolated model reduction tech- 
nique. It can be combined with many existing algorithms to simplify a high dimen- 
sional complex system. As an extension, based on IRPI, a local manifold learning 
approach is discussed in the next section to compute a set of subspaces with much 
lower dimension where local solutions reside. 

5. Locally Linear Projection. Consider a large-scale dynamical system whose 
solution exhibits vastly different states as it evolves over a large time. In such a case, 
one might expect that the dominant modes of the system to change significantly 
overtime. As a result the snapshots at earlier evolution times might not carry much 
representative information on the state of the system at later times. In this case, two 
options might be considered: 1) build a global information matrix from snapshots from 
t = to the final time of interest; 2) build the overall solution from a set of smaller 
information matrices over smaller time segments which each of them span a subspace 
for a local solution. The classical Galerkin-POD approach takes the first option 
and the corresponding information matrix often needs a large number snapshots in 
order to represent the full trajectory and its vector field. Large number of snapshots 
can give rise to large computation consequently, for two aspects. First of all, the 
computational cost of SVD of information matrix is proportional to the square of 
the number of snapshots, as shown in equation (j4.46p . In addition, a large number 
of snapshots span a reduced space with a relatively high dimension. Accordingly, 
more computational cost is needed to form a reduced model. This effect is even 
considerably increased when treating a nonlinear problem. For example, as in (j4.57p . 
the cost to construct the coefficient Auj for the reduced model of Burgers equation 
is proportional to the cube of the dimension of the reduced space, k. If k is large, 
the cost for Galerkin projection may even outweigh the cost to solve the original 
equation. Meanwhile, more modes is computed for each time step, while some of 
which are redundant for a specified time. 

As a consequence, the goal is to provide a reduced modelling technique whose 
complexity is independent of the time evolution. It is noticed that a manifold em- 
bedded in a high dimensional space is locally similar enough to a linear subspace. 
Among many dimensionality reduction algorithms, Locally Linear Embedding (LLE) 
[35) might be the most classical one to discover nonlinear structure in high dimensional 
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Figure 5.1. Graphical description of locally linear projection. Ui C M is an open set that 
contains the local solution x{t) for the time interval t a Ji. ipi : Ui Vi is the linear map defining 
the chart. Vi is an open set in a local subspace Si . The dimension of Si is significantly less than 
the dimension of the subspace than contains the entire solution. 

data by exploiting the locally linear reconstructions. Since an orbit of the dynam- 
ical system (|2.ip is a one-dimensional curve, it is possible to use a linear subspace 
with significant lower dimension A: > 2 to represent the local structure of the curve. 
The concept of time domain decomposition was already introduced in j37| in order to 
study fine dynamics by a set of locally invariant manifolds. Adaptive time partition 
technique is developed in |14| . Moreover, space domain partition [I'd] and parameter 
domain partition [TS] have been applied in POD method. All of these techniques 
can sharply reduce the dimension of the original system. In this section, we combine 
the idea of time domain separation with IRPI and propose a locally linear projection 
(LLP) algorithm for model reduction. Convergence analysis is also provided. Then 
we apply this algorithm for the numerical simulation of the Navier-Stokes equation in 
a cavity flow problem. 

5.1. Algorithm of Locally Linear Projection. Suppose Ai C M" is a sub- 
manifold which contains the solution trajectory x{t) for all t G J. Then in this time 
domain, by definition of the tangent space, the derivative x(t) satisfies x{t) C T^(^t^M, 
which leads to f{t,x) C Tj,(t)7M. Consider an arbitrary xq € M and a chart {U,ip) 
with xq € U, where tp : U ^ V is the map defining the chart. Using the local coor- 
dinate X = ri{z) = ip^^{z,Q) of the manifold, we can obtain the differential equation 
in terms of z. By using the chain rule, we have x{t) = if {z{t))z{t). Then the initial 
value problem becomes 

(5.1) z = r,'{zit))+f{t,rj{z)), 

where = (A^ A)~^ A denotes the pseudo-inverse of a matrix with full column rank. 
The initial condition zq is given by xq = "riizo)- The existence and uniqueness of the 
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solution in (|5.ip has been proved in [T^]. The solution of (|2.ip and (|5.ip are related 
by x{t) — ri{z{t)). So any approximation of z{t) provides an approximation for x{t). 

In this section, the whole time domain J := [0,T] is partitioned into M smaller 
subintervals Ji, Jj\/ with := ti\. We slightly abuse the notation and denote 
by the subscript i the subinterval index. Let <o = ^^^d tjv/ = T, so that J = ijfli Ji. 
Let C be an open set that contains the local solution x{t) for t e Ji. For 
simplicity, we can use a linear map ipi : Ui Vi to construct a chat (Ui,ipi), as 
Figure [5721 Let k = dim(Vi), be the dimension of local coordinate z. In this case, 
r]{z{t)) = ^iz{t) for t £ Ji, where <f>i G R*-'^". When <!> is formed by k ectohormonal 
basis vectors, (|5.ip degenerate to the locally reduced model formed by the Galcrkin 
projection, i.e., 

(5.2) i = $f/(i,<I>,z), 

for t £ Ji. Then IRPI can be applied to find a good approximation of the invariant 
subspace and obtain a convergent solution. Specifically, for each subinterval, we set 
a trial solution, x'^{t) initially. During iteration j an extended data ensemble, Yf~^, 
containing a few snapshots within the subinterval, is constructed from an approximat- 
ing solution trajectory and then served for generating the empirical eigenfunctions of a 
subspace to be used in the next iteration cycle. After locally projecting the full model 
onto this subspace and constructing a reduced model through (|5.2p , the time integra- 
tion of Picard iteration is carried out implicitly to obtain an updated approximation. 
After convergence, one can move forward to the next subinterval. 

Suppose a convergent solution x{t) for subinterval i is obtained by IRPI. Let 
:= x{ti-i) be the starting state of this subinterval and Q ■= x{ti) be the ending 
state. The initial sate for subinterval i is given by 



(5.3) 



xq if i = 1, 
G-i if i > 1. 



There are several options to estimate the trial solution SP{t) for t € Ji, and we just 
list a few here. One can simply set the trial solution for subinterval i as a constant, 
which means 5;°(t) = S^i for t £ Ji (although it is very inaccurate). Alternatively, a 
coarse model can be used to obtain a rough estimation of Xiit). These two methods 
can also be used for IRPI, as discussed in the previous section. The third option is 
to use the time history of solution trajectory to obtain an initial estimation of the 
invariant subspace. Similar to |26[|29j . one can assume the solution for the subinterval 
i approximately reside in the invariant subspace of the previous one. Thus, a set of 
empirical eigenfunctions can be generated by SVD of the solution matrix or informa- 
tion matrix by snapshots in Ji-i. The general form of the extended data ensemble 
is 

(5.4) Y^^m.iiim)]. 

for all t G Ji-i. Especially, if only the starting snapshot and the ending snapshot are 
used to construct the discrctized information matrix, we have 

(5.5) = [(:^-2X^-lnI{t^-2X^-2)^f{U-lX^-l)]■ 



After projecting the full model onto this subspace, we can calculate the trial solu- 
tion for t E Ji. Since time- history-based initialization can not be used for the first 
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subinterval, therefore, we have to combine it with one of the other methods for ini- 
tiahzation. 

After obtaining a trial solution for a subinterval, IRPI is used to obtain a better 
approximation of the actual solution. For iteration j, the information matrix is given 

by 

(5.6) Yf ^[x={t),^f{x=m, 

for all t E Ji. It is noticed that the reduced subspace of a local solution has a 
significantly lower dimension when the width of the corresponding subinterval is small 
enough. In this case, only a small number of snapshots m is needed from the trajectory 
of the previous iteration. For example, if m = 2, the information matrix Yf can be 
constructed from snapshot tt-i and tt, 

(5.7) Yi = [C,^ui'{t^)J{U-^,C^-l)J{U,x'{t,))]. 

For the next step, POD (SVD) can be used to construct the empirical eigenvalues $^ 
for the reduced subspace. Theoretically the dimension of the reduced subspace can 
be lower than 2m. However, as long as m is small enough, say m ~ 2, there is no need 
to apply POD for further dimension reduction here. Instead, $^ can be computed 
more efficiently by the Gram-Schmidt process. Then the full model is projected onto 
the subspace spanned by and a local reduced model is obtained. The algorithm |4] 
represents the complete process of LLP. 

5.2. Convergence Analysis. We have already shown the capability of IRPI to 
effectively find a global invariant subspace for a dynamical system, and thus generate a 
sequence of functions that converges to the actual solution. Since LLP is an extension 
of IRPI, a convergent solution can be obtained for each subinterval, and a combination 
of these local solutions provides a full trajectory for the original system. 

Begin with the first subinterval. In an ideal situation, G S'f , while the solution 
and vector field satisfy x^^^{t) C Sf and f{t, x^^^) C for all t E Ji, respectively. As 
thcorcm l4. 71 indicates. LLP can obtain the convergent solution for t G Ji. If the vector 
field in the initial value problem (|2.ip is Lipschitz, then the solution x{t) ~ w{t; x{ti)) 
continually depends on the initial condition x{ti). For this reason, starting from ^2, 
we can obtain a sequence of functions that converges to x{t) for t G J2. We can move 
forward to the rest of subintervals, and achieve the following theorem. 

Theorem 5.1 (Convergence of LLP). Consider solving the initial value problem 
\2.1\] by LLP for the time domain J = [0,T] , which is partitioned into M smaller 
subintervals Ji,..., Jm with Ji := [ti^i^ti]. Let to = and tm = T, so that J = 
^iLiJi- Suppose the vector field f{t, x) is locally Lipschitz on D' , where D' is an open 
set contains {t,x(t)) for all t £ J. For subinterval i, the IRPI approach is applied 
to obtain an approximation for local solution. Let x{t) be the local solution of the 
full model, and x^ (t) be the solution of the reduced model at iteration j. In an ideal 
situation, E Sj , where ^i, the initial condition of subinterval i, is given by fiS.S]) . 
Furthermore, the solution and vector field satisfy x^'^it) C Si and f(t,x^~^) C Sf 
for all t G Ji-i, respectively. Then for all i G {1, Af} and t G Ji, the sequence of 
functions x^ {t) uniformly converges to x{t) as j — !> 00. 

Finally, it is interesting to consider LLP as a generalization of many current time 
integration schemes. We take m = 2 as an example. The time-history-initialization 
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Algorithm 4 Locally linear projection (LLP) 
Require: The initial value problem (j2.1|) . 
Ensure: An approximating solution x{t). 

Divide the whole time domain into smaller subintervals Ji, Jm 

for subinterval i do 

Set the initial condition (|5.3p and obtain a test function x'^ (t) as the trial solution. 

Initialize the iteration number j = 0. 

repeat 

1: Update the iteration number j = j + 1. 

2: Assemble snapshots of an approximating solution x^~^{t) into matrix form 
3: Compute vector field matrix whose columns associated with snapshots 

in xr\ 

4: Form an information matrix for the extended data ensemble Yf~^ = 

5: Based on Y^^ , compute the empirical eigenfunctions through Gram- 
Schmidt process. 

6: Project the original equation onto a linear subspace spanned by $^ and form 
a reduced model (|4.2p . 

7: Solve the reduced model and obtain an approximating solution zj (t) in the 
subspace coordinate system. 

8: Express the updated solution in the original coordinate system xl{t) = 

until Le^ — a;^ ^ < e, where e is the tolerant error. 
Obtain the local solution x{t) = x^{t), for t e J^. 
end for 



provides a linear subspace spanned by Y^^ to estimate the trial solution for t ^ Ji. 
Especially, when t = ti, the initial estimation of the state vector is given by 



(5.8) 



where 1^° is given by (|5.5p and is a vector that contains 4 elements. Let 6T denote 
the average width of one subinterval, 5T := (T — ti)/M; and 6t denote the width of 
one step for the time integration. In the limit when 6T = St, LLP degenerates to the 
two-step Adams-Bashforth scheme if 



0,1,- 



1 



1 T 



27(5t' 2jSt 



On the other hand, if one uses the IRPI technique to obtain a better estimation at 
t ~ ti, the approximating solution is given by 



(5.9) 



where it is assumed only two snapshots are used to construct the information matrix 
y/, as expressed by (|5.7p . In the limit when ST = 6t, LLP degenerates to the Crank- 
Nicholson scheme if 



1,0, 



1 



1 



2jSr 2jSt 
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More generally, assume n snapshots are sampled from the previous subinterval, then 
in the limit 6T = 5t, the time-history initialization can degenerate to the n-step 
Adams-Bashforth method if proper coefficients are set for <^°. In addition, if the Y^^ 
has n+1 snapshots from Ji, and the first n snapshots are overlapping with Ji_i, then 
each iteration defined by IRPI can degenerate to the n-step Adams-Moulton method. 
Furthermore, if ST = nSt, then each iteration defined by IRPI is a generalized form 
of the n-order Runge-Kutta method with variable coefficients. 

However, as a manifold learning technique, LLP determines a set of eigenfunctions 
for a subspace and then use a reduced model to calculate the coefficient values for the 
next subinterval. This is more fiexible than a common scheme for the time integration 
since the later uses predesigned coefficients for each vectors in the information matrix. 
Therefore, LLP has the ability to provide more stable results for a fixed time interval. 
Even if 5T ^ St, LLP can still obtain stable results with a high accuracy. In the next 
subsection, the LLP approach is applied to a lid-driven cavity flow problem. 

5.3. Cavity Flow Problem. Consider a lid-driven cavity flow problem in a 
rectangular domain = [0, 1] x [0, 1]. The space domain is fixed in time. Mathemati- 
cally the problem can be represented in terms of the stream function ip and vorticity w 
formulation of the incompressible Navier-Stokcs equation. In non-dimensional form, 
the governing equations are given as 

^ d(jj dtp du! dtp du! 1 / d^oj d^uj 

dt dy dx dx dy Re \ dx"^ dy'^ 

where. Re is the Reynolds number, and x and y arc the Cartesian coordinates. The 
velocity field is given by u = dtp/dy, v = —dtp/dx. The zero-slip condition at the 
nonporous walls yields the following boundary conditions for tp, 

(5.12) tPB=0, 

where B denotes the points on the walls. The boundary conditions for lo are derived 
from the definition of lo as given by (jS.lOp . The first-order-corrcct boundary conditions 
are given by 

(5.13) u:b - ii^B-i - ^^3-2)^ + (^1^) + 0{h). 

where B ~i denotes the interior points whose distance from the boundary equals ixh, 
and {dtp/dn)B is the tangent velocity of the boundary. At y = 1, {dtp/dn)B — 1. 
(dtp/dn)B = on other walls. The initial condition is set as u{x,y) — v{x,y) = 
0. The discretization is performed on a uniform mesh with a second-order central 
finite difference approximations for second-order derivatives in (|5.10p and (|5.1ip . The 
convective term in (j5.1ip is discretized via a first-order upwind difference scheme. For 
the time integration of (|5.1ip . the implicit Crank-Nicolson scheme is applied for the 
diffusion term, and the explicit two-step Adams-Bashforth method is employed for 
the advection term. 

In the numerical simulation, Reynolds number is set as Re= 1000. The full model 
uses 129 X 129 grid points with 6t ~ 5 x 10"^ as the unit time step. The whole time 
domain. [0, 50], is divided into 250 subintcrvals. For each subinterval, the trial solution 
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Figure 5.2. Streamline pattern for driven cavity problem with _Re=1000. (a) The full model uses 
129 X 129 grid points, (b) The approximating result obtained through LLP. The whole time domain, 
[0,50], is partitioned into 250 subintervals. For each subinterval, a trial solution is calculated from 
33 X 33 grid points. An average of 5 iterations are used to achieve a better approximation. We 
plot the contours of tp whose values are —1 X 10^^'^, —1 X 10~'^, —1 X 10~^, —1 X 10"*, —0.01, 
-0.03, -0.05, -0.07, -0.09, -0.1, -0.11, -0.115, -0.1175, 1 x 10"**, 1 x 10"^, 1 x lO'^, 1 x 10-^, 
5 X 10"■^ 1 X 10""', 2.5 X 10""', 1 x 10-3, 1.3 x 10"^, and 3 x 10"^. 




Figure 5.3. (a) Comparison of velocity component u{x = 0.5, y) along the y-direction passing 
geometric center between full model and reduced models at t = 50. (b) Comparison of velocity 
component v{x,y = 0.5) along the x-direction passing geometric center between full model and 
reduced models att = 50. Velocity profiles based on the coarse model, which uses 33 X 33 grid points, 
significantly deviate from the actual ones. For each subinterval, LLP uses the coarse model for the 
trial solution, and then use the IRPI approach to obtain a better local solution with high accuracy. 



is obtained through a simulation based on 33 x 33 coarse grid points with A5t as the 
unit time step. A sequence of functions defined by LLP is used to approach the local 
solution. For each iteration, 3 snapshots are taken to span the local subspace. Instead 
of SVD, Gram-Schmidt process is used to form the local empirical eigenfunctions. An 
average of 5 iterations is carried out to obtain a local convergent solution for each 
subinterval. 

The streamline contours for the lid-driven cavity flow are shown in Figure 15.21 
In I5.2r a). the full model matches well with the simulation results from [l^, and 
the values of ip of the contours are the same as table III of [l6l. LLP provides an 
approximating solution. The main error occurs around the vortex center, where the 
contour of ijj = —0.1175 is missing. Figure [5751 shows the velocity profiles for u along 
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Figure 5.4. Plot of the time evolution of the vorticity error. The solid (red) line shows the 
norm-2 difference between the benchmark solution u){t) and the coarse solution cjO(t) Af = 129 X 129 
grids are used to obtain the full model with time step (5t = 5 X 10~^ . The coarse model uses 
n = 33 X 33 grids with time step 45t. The dashed line shows the norm-2 difference between the 
and the approximating solution ui{t) obtained by LLP. An average of 5 iterations are used to obtain 
the convergence for all the subintervals. 

vertical lines and v along horizontal lines passing through the geometric center of the 
cavity. We compare the approximating solution obtained by LLP with the coarse 
results which are obtained by the 33 x 33 grid points. Starting from the coarse 
solution, LLP is able to obtain a much more accurate results for u{x = 0.5, y) and 
v{x, y = 0.5). Finally, Figure [5^ shows the L2 error for the time evolution of vorticity 
for both coarse model and LLP. In terms of vorticity, LLP can significantly reduce 
the error from the coarse model with an average of 5 iterations. 

6. Conclusion. In this article, a new online manifold learning framework, im- 
plicitly reduced Picard iteration (IRPI), is proposed for reduced-order modeling of 
large-scale nonlinear problems where both the data sets and the dynamics are system- 
atically reduced. This framework does not require prior simulations or experiments 
to obtain data vectors. During each iteration cycle, an approximating solution is 
calculated in a low dimensional subspace, and provides many snapshots to construct 
an information matrix. A POD (SVD) method can be applied to generate a set of 
empirical eigenfunctions which span a new subspace. In an ideal case, a sequence of 
functions defined by IRPI uniformly converges to the actual solution of the original 
problem. We also provide an error bound for IRPI, considering the truncation er- 
ror. The capability of IRPI to solve a high dimensional system with high accuracy 
is demonstrated in both linear advection-diffusion equation and nonlinear Burgers 
equation. Moreover, IRPI can also be used as an efficient posterior error estimator 
for any other coarse models or reduced models. 

In order to reduce the cost of the IRPI approach, the LLP technique is developed 
as an extension of IRPI. Using the idea of partitioning the time domain, the IRPI 
technique is used to obtain a better approximation for each subinterval. Since each 
subinterval has much less solution variations, the associated subspace has significantly 
lower dimension that the overall problem over the entire time domain. The numerical 
results of nonlinear Navicr-Stokcs equation through a cavity flow problem imply that 
the LLP can obtain significant speedups while maintaining a good accuracy. 

There arc still many other related questions to study in future. For example, since 
the choice of the extended data ensemble is not unique, there might be other methods 
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to form the information matrix which rcsuhs in more efficiently reduced model. It 
is noticed that POD (SVD) is not the unique technique to find the dominant modes 
based on the information matrix defined in this paper. How to combine IRPI with 
other fast model reduction techniques remains a topic for future research. 
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